SpotLight: An Information Service for the Cloud

Author: Ouyang Xue
Publication venue: ScholarWorks@UMass Amherst
Publication date: 13/07/2016

Infrastructure-as-a-Service cloud platforms are incredibly complex: they rent hundreds of different types of servers across multiple geographical regions under a wide range of contract types that offer varying tradeoffs between risk and cost. Unfortunately, the internal dynamics of cloud platforms are opaque in several dimensions. For example, while the risk of servers not being available when requested is critical in optimizing these risk-cost tradeoffs, it is not typically made visible to users. Thus, inspired by prior work on Internet bandwidth probing, we propose actively probing cloud platforms to explicitly learn such information, where each "probe" is a request for a particular type of server. We model the relationships between different contract types to develop a market-based probing policy, which leverages the insight that real-time prices in cloud spot markets loosely correlate with the supply (and availability) of fixed-price on-demand servers. That is, the higher the spot price for a server, the more likely the corresponding fixed-price on-demand server is not available. We incorporate market-based probing into SpotLight, an information service that enables cloud applications to query this and other data, and use it to monitor the availability of more than 4500 distinct server types across 9 geographical regions in Amazon's Elastic Compute Cloud over a 3-month period. We analyze this data to reveal interesting observations about the platform's internal dynamics. We then show how SpotLight enables two recently proposed derivative cloud services to select a better mix of servers to host applications, which improves their availability from 70-90% to near 100% in practice.
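
The market-based probing idea can be illustrated with a minimal Python sketch. It is not SpotLight's actual policy: the rule that maps the spot/on-demand price ratio to a probing probability, the `base` floor, and the server type names and prices are all assumptions made for illustration.

```python
import random

def probe_probability(spot_price, on_demand_price, base=0.05):
    """Map the spot/on-demand price ratio to a probing probability.

    Hypothetical rule: a higher spot price suggests scarcer on-demand
    capacity, so the server type is probed more often.
    """
    ratio = spot_price / on_demand_price
    return min(1.0, max(base, ratio))

def choose_probes(markets, rng):
    """Return the server types to probe in this round.

    `markets` maps a server type to (spot_price, on_demand_price).
    """
    selected = []
    for server_type, (spot, on_demand) in sorted(markets.items()):
        if rng.random() < probe_probability(spot, on_demand):
            selected.append(server_type)
    return selected

# Made-up prices for two made-up server types.
markets = {
    "type-a": (0.02, 0.10),  # cheap spot: on-demand is probably available
    "type-b": (0.25, 0.10),  # spot above on-demand: shortage is more likely
}
print(choose_probes(markets, random.Random(0)))
```

Each selected probe would then be a real request for that server type. Probing in proportion to the price signal concentrates requests where unavailability is most likely.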

ScholarWorks@UMass Amherst

Numerical investigation of the phase change in transpiration cooling with the VOF method

Authors: Jiang Pei-Xue; Ouyang Xiao-Long
Publication venue: ECI Digital Archives
Publication date: 06/07/2016

Transpiration cooling with phase change is numerically investigated in the present work. As shown in Figure 1, a liquid coolant flow is injected into a porous medium from the bottom side. The porous medium receives heat from the hot gas on the top surface and heats the coolant. Thus, phase change can occur in this porous medium. The surface temperature, the heat flux received by the porous medium, the phase distribution and the flow and cooling characteristics are the most important unknowns on this topic.

Engineering Conferences International

Intelligent Straggler Mitigation in Massive-Scale Computing Systems

Author: Ouyang Xue
Publication venue: University of Leeds
Publication date: 01/03/2018

In order to satisfy increasing demands for Cloud services, modern computing systems are often massive in scale, typically consisting of hundreds to thousands of heterogeneous machine nodes. Parallel computing frameworks such as MapReduce are widely deployed over such cluster infrastructure to provide reliable yet prompt services to customers. However, complex characteristics of Cloud workloads, including multi-dimensional resource requirements and highly changeable system environments, e.g. dynamic node performance, are introducing new challenges to service providers in terms of both customer experience and system efficiency. One primary challenge is the straggler problem, whereby a small subset of the parallelized tasks take abnormally long to execute in comparison with their siblings, leading to extended job response and potential late-timing failure. The state-of-the-art approach to straggler mitigation is speculative execution. Although it has been deployed in several real-world systems with a variety of implementation optimizations, the analysis from this thesis has shown that speculative execution is often inefficient. According to various production tracelogs of data centers, the failure rate of speculative execution could be as high as 71%. Straggler mitigation is an inherently complicated problem: 1) stragglers may lead to different consequences for parallel job execution, possibly with different degrees of severity, 2) whether a task should be regarded as a straggler is highly subjective, depending upon different application and system conditions, 3) the efficiency of speculative execution would be improved if dynamic node performance could be modelled and predicted appropriately, and 4) there are other types of stragglers, e.g. those caused by data skews, that are beyond the capability of speculative execution.
This thesis starts with a quantitative and rigorous analysis of issues with stragglers, including their root-causes and impacts, the execution environment running them, and the limitations to their mitigation. Scientific principles of straggler mitigation are investigated and new algorithms are developed. An intelligent system for straggler mitigation is then designed and developed, being compatible with the majority of current parallel computing frameworks. Combined with historical data analysis and online adaptation, the system is capable of mitigating stragglers intelligently, dynamically judging a task as a straggler and handling it, avoiding current weak nodes, and dealing with data skew, a special type of straggler, with a dedicated method. Comprehensive analysis and evaluation of the system show that it is able to reduce job response time by up to 55%, as compared with the speculator used in the default YARN system, while the optimal improvement a speculative-based method may achieve is around 66% in theory. The system also achieves a much higher success rate of speculation than other production systems, up to 89%.
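
To make the straggler notion concrete, here is a minimal Python sketch of a static threshold rule. It is an illustrative assumption, not the thesis's adaptive method: the rule (a task is a straggler if it runs longer than `threshold` times the job's median duration), the default of 1.5, and the task durations are all made up.

```python
from statistics import median

def find_stragglers(durations, threshold=1.5):
    """Return ids of tasks whose duration exceeds threshold x the job median.

    `durations` maps a task id to its (estimated) execution time.
    """
    if not durations:
        return []
    cutoff = threshold * median(durations.values())
    return sorted(task for task, d in durations.items() if d > cutoff)

def job_response_time(durations):
    """A parallel job finishes only when its slowest task finishes."""
    return max(durations.values())

# Made-up durations in seconds for one four-task job.
durations = {"t1": 10.0, "t2": 11.0, "t3": 9.5, "t4": 42.0}
print(find_stragglers(durations))
print(job_response_time(durations))
```

A fixed `threshold` is exactly the kind of subjective choice the abstract points to: the same value can be too strict for one application and too lenient for another.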

White Rose E-theses Online

Improved Decoding of Expander Codes

Authors: Chen Xue; Cheng Kuan; Li Xin; Ouyang Minghui
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)
Publication date: 01/01/2022

We study the classical expander codes, introduced by Sipser and Spielman [M. Sipser and D. A. Spielman, 1996]. Given any constants 0 < α, ε < 1/2, and an arbitrary bipartite graph with N vertices on the left, M < N vertices on the right, and left degree D such that any left subset S of size at most αN has at least (1-ε)|S|D neighbors, we show that the corresponding linear code given by parity checks on the right has distance at least roughly αN/(2ε). This is strictly better than the best known previous result of 2(1-ε)αN [Madhu Sudan, 2000; Viderman, 2013] whenever ε < 1/2, and improves the previous result significantly when ε is small. Furthermore, we show that this distance is tight in general, thus providing a complete characterization of the distance of general expander codes. Next, we provide several efficient decoding algorithms, which vastly improve previous results in terms of the fraction of errors corrected, whenever ε < 1/4. Finally, we also give a bound on the list-decoding radius of general expander codes, which beats the classical Johnson bound in certain situations (e.g., when the graph is almost regular and the code has a high rate). Our techniques exploit novel combinatorial properties of bipartite expander graphs. In particular, we establish a new size-expansion tradeoff, which may be of independent interest.
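
A quick check of the comparison between the two distance bounds follows. The symbol names are assumed here: α is the fraction bounding the size of the left subsets and ε is the expansion loss. The "roughly" in the new bound is ignored.

```latex
\frac{\alpha N / (2\varepsilon)}{2(1-\varepsilon)\,\alpha N}
  = \frac{1}{4\varepsilon(1-\varepsilon)},
\qquad
4\varepsilon(1-\varepsilon) = 1 - (1-2\varepsilon)^{2} < 1
\quad \text{for } 0 < \varepsilon < \tfrac{1}{2}.
```

The ratio therefore exceeds 1 for every expansion loss below 1/2, and it grows like 1/(4ε) as the loss tends to 0. This matches the claim that the improvement is significant when the loss is small.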

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Effective solid-to-fluid heat transfer coefficient in EGS reservoirs

Authors: Jiang Pei-Xue; Ouyang Xiao-Long; Xu Rui-Na
Publication venue: ECI Digital Archives
Publication date: 23/06/2014

The present work developed a three-equation local thermal non-equilibrium model to predict the effective solid-to-fluid heat transfer coefficient in the enhanced geothermal system reservoirs based on the volume averaging method. Due to the high rock-to-fracture size ratio, the solid thermal resistance effect in the internal rocks cannot be neglected in the effective solid-to-fluid heat transfer coefficient. The present three-equation local thermal non-equilibrium model can consider the dynamic variation of the solid thermal resistance in transient heat transfer by introducing the penetration temperature difference. The model was validated by comparison with pore-scale numerical simulations and macro-scale LTNE model numerical simulations. The results show that the three-equation local thermal non-equilibrium model has a high accuracy

Engineering Conferences International

Validity of self-reported weight, height and resultant body mass index in Chinese adolescents and factors associated with errors in self-reports

Authors: Cheng Yue; Dibley Michael J; Ouyang Xue; Yan Hong; Zhou Xiaoyan
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Validity of self-reported height and weight has not been adequately evaluated in diverse adolescent populations. In fact, there are no reported validity studies conducted in Asian children and adolescents. This study aims to examine the accuracy of self-reported weight, height, and resultant BMI values in Chinese adolescents, and of the adolescents' subsequent classification into overweight categories. Methods: Weight and height were self-reported and measured in 1761 adolescents aged 12-16 years in a cross-sectional survey in Xi'an city, China. BMI was calculated from both reported values and measured values. Bland-Altman plots with 95% limits of agreement, Pearson's correlation and Kappa statistics were calculated to assess the agreement. Results: The 95% limits of agreement were -11.16 and 6.46 kg for weight, -4.73 and 7.45 cm for height, and -4.93 and 2.47 kg/m² for BMI. Pearson correlation between measured and self-reported values was 0.912 for weight, 0.935 for height and 0.809 for BMI. Weighted Kappa was 0.859 for weight, 0.906 for height and 0.754 for BMI. Sensitivity for detecting overweight (includes obese) in adolescents was 56.1%, and specificity was 98.6%. Subjects' area of residence, age and BMI were significant factors associated with the errors in self-reporting weight, height and relative BMI. Conclusions: Reported weight and height do not have acceptable agreement with measured data. Therefore, we do not recommend the application of self-reported weight and height to screen for overweight adolescents in China. Alternatively, self-reported data could be considered for use, with caution, in surveillance systems and epidemiology studies.
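
The agreement statistics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the study's data or code. The sign convention (reported minus measured) and the 1.96 multiplier for the 95% limits are assumptions of the sketch.

```python
from statistics import mean, stdev

def limits_of_agreement(reported, measured):
    """Bland-Altman 95% limits: mean difference +/- 1.96 sample standard deviations.

    Differences are taken as reported minus measured (assumed convention).
    """
    diffs = [r - m for r, m in zip(reported, measured)]
    centre = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return centre - spread, centre + spread

def sensitivity_specificity(reported_flags, measured_flags):
    """Score the self-reported classification against the measured one.

    Assumes at least one measured positive and one measured negative.
    """
    pairs = list(zip(reported_flags, measured_flags))
    tp = sum(1 for r, m in pairs if r and m)
    fn = sum(1 for r, m in pairs if not r and m)
    tn = sum(1 for r, m in pairs if not r and not m)
    fp = sum(1 for r, m in pairs if r and not m)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up weights in kg and made-up overweight flags for four subjects.
reported = [50.0, 52.0, 60.0, 58.0]
measured = [52.0, 52.0, 63.0, 57.0]
low, high = limits_of_agreement(reported, measured)
sens, spec = sensitivity_specificity(
    [True, False, False, False],   # overweight according to self-report
    [True, True, False, False],    # overweight according to measurement
)
print(low, high, sens, spec)
```

Wide limits of agreement can coexist with a high Pearson correlation, which is why the abstract reports both before concluding that agreement is not acceptable.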

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ECM-IBS: A Chebyshev map-based broadcast authentication for Wireless Sensor Network

Authors: Cao Yi; Ding Xuemei; Liu Junxiu; Liu Yunqi; Luo Yuling; Ouyang Xue
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/08/2019

Edinburgh Research Explorer

Perphon: a ML-based Agent for Workload Co-location via Performance Prediction and Resource Inference

Authors: Hu C; Ouyang J; Wo T; Xu J; Xue S; Yang R; Zhu J
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/11/2019

Cluster administrators face great pressure to improve cluster utilization through workload co-location. Guaranteeing performance of long-running applications (LRAs), however, is far from settled, as unpredictable interference across applications is catastrophic to QoS [2]. Current solutions such as [1] usually employ sandboxed and offline profiling for different workload combinations and leverage them to predict incoming interference. However, the time complexity restricts the applicability to complex co-locations. Hence, this issue entails a new framework to harness runtime performance and mitigate the time cost with machine intelligence: i) It is desirable to explore a quantitative relationship between allocated resource and consequent workload performance, not relying on analyzing interference derived from different workload combinations. The majority of works, however, depend on offline profiling and training, which may lead to a model aging problem. Moreover, multi-resource dimensions (e.g., LLC contention) that are not completely included by existing works but have an impact on performance interference need to be considered [3]. ii) Workload co-location also necessitates a fine-grained isolation and access control mechanism. Once performance degradation is detected, dynamic resource adjustment will be enforced and the application will be assigned access to specific slices of each resource. Inferring a "just enough" amount of resource adjustment ensures the application performance can be secured whilst improving cluster utilization. We present Perphon, a runtime agent on a per node basis, that decouples ML-based performance prediction and resource inference from the centralized scheduler. Figure 1 outlines the proposed architecture. We initially exploit the sensitivity of applications to multi-resources to establish performance prediction.
To achieve this, Metric Monitor aggregates application fingerprint and system-level performance metrics including CPU, memory, Last Level Cache (LLC), memory bandwidth (MBW) and number of running threads, etc. They are enabled by Intel-RDT and precisely obtained from the resource group manager. Perphon employs an Online Gradient Boost Regression Tree (OGBRT) approach to resolve the model aging problem. Res-Perf Model warms up via offline learning that merely relies on a small volume of profiling in the early stage, but evolves with the arrival of workloads. Consequently, parameters will be automatically updated and synchronized among agents. Anomaly Detector can promptly pinpoint a performance degradation via LSTM time-series analysis and determine when and which application needs to be re-allocated resources. Once an abnormal performance counter or load is detected, Resource Inferer conducts a gradient-ascent-based inference to work out a proper slice of resources, towards dynamically recovering targeted performance. Upon receiving an updated re-allocation, Access Controller re-assigns a specific portion of the node resources to the affected application. Eventually, Isolation Executor enforces resource manipulation and ensures performance isolation across applications. Specifically, we use the cgroup cpuset and memory subsystems to control usage of CPU and memory while leveraging Intel-RDT technology to underpin the manipulation of LLC and MBW. For fine-granularity management, we create different groups for LRA and batch jobs when the agent starts. Our prototype integration with the Node Manager of Apache YARN shows that the throughput of a Kafka data-streaming application in Perphon is 2.0x and 1.82x that of the isolation execution schemes in native YARN and the pure cgroup cpu subsystem, respectively.
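
The gradient-ascent inference step can be illustrated with a small Python sketch. It is not Perphon's Resource Inferer: `infer_allocation`, `toy_model`, the finite-difference gradient, the step size, and the resource names are all hypothetical. In particular, `toy_model` is a hand-written stand-in for a learned resource-to-performance model.

```python
def infer_allocation(perf_model, allocation, target,
                     step=0.05, max_iters=200, eps=1e-3):
    """Raise resource shares along the numerical gradient of perf_model
    until the predicted performance reaches `target` (or max_iters is hit)."""
    alloc = dict(allocation)  # work on a copy; the caller's dict is untouched
    for _ in range(max_iters):
        current = perf_model(alloc)
        if current >= target:
            break
        # Finite-difference gradient of predicted performance per resource.
        grad = {}
        for name in alloc:
            bumped = dict(alloc)
            bumped[name] += eps
            grad[name] = (perf_model(bumped) - current) / eps
        # Ascent step, clamped so each share stays within [0, 1] of the node.
        for name in alloc:
            alloc[name] = min(1.0, max(0.0, alloc[name] + step * grad[name]))
    return alloc

def toy_model(alloc):
    """Hypothetical stand-in for a learned resource-to-performance model."""
    return 0.6 * alloc["cpu"] + 0.4 * alloc["llc"]

start = {"cpu": 0.2, "llc": 0.2}
result = infer_allocation(toy_model, start, target=0.5)
print(result, toy_model(result))
```

Stopping as soon as the target is met reflects the "just enough" goal in the abstract: the loop grants only the resources the model says are needed, leaving the rest for co-located workloads.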

White Rose Research Online